Fine-grained record integration and linkage tool.

Authors

  • Pawel Jurczyk
  • James J Lu
  • Li Xiong
  • Janet D Cragan
  • Adolfo Correa
Abstract

BACKGROUND: As part of the surveillance program to monitor the occurrence of birth defects in the metropolitan Atlanta area, we developed a record linkage software tool that provides latitude in the choice of linkage parameters, allows for efficient and accurate linkages, and enables objective assessments of the quality of the linked data.

METHODS: We developed and implemented a Java-based fine-grained probabilistic record integration and linkage tool (FRIL) that incorporates a rich collection of record distance metrics, search methods, and analysis tools. Along its workflow, FRIL provides a rich set of user-tunable parameters augmented with graphic visualization tools to assist users in understanding the effects of parameter choices. We used this software tool to link data from vital records (n = 1.25 million) with birth defects surveillance records (n = 12,700) from the Metropolitan Atlanta Congenital Defects Program (MACDP) for the birth years 1967-2006.

RESULTS: Compared with the data linkage performed by conventional algorithms, the data linkage of birth certificates with birth defect records in MACDP using FRIL was more efficient. The linkage based on FRIL was also accurate, showing 99% precision and 95% recall. Based on positive user feedback, new features continue to be developed, and the tool is being adopted in several other data linkage projects in MACDP.

CONCLUSIONS: A software tool that allows significant user interaction and control, such as FRIL, can provide accurate data linkages for birth defect surveillance programs and allows an objective assessment of the quality of linked data.
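The abstract describes linkage driven by record distance metrics, user-tunable parameters, and quality assessment through precision and recall. The sketch below illustrates that general idea in Python; it is not FRIL's implementation (FRIL is a Java tool). The field names, weights, thresholds, sample records, and the choice of edit distance as the metric are all assumptions made for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Map edit distance to a score in [0, 1]; 1.0 means identical strings."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


# Hypothetical tunable parameters: per-field weights and an acceptance threshold.
WEIGHTS = {"last_name": 0.5, "first_name": 0.3, "birth_date": 0.2}
THRESHOLD = 0.85


def match_score(left: dict, right: dict) -> float:
    """Weighted sum of per-field similarities (weights sum to 1)."""
    return sum(w * similarity(left[f].lower(), right[f].lower())
               for f, w in WEIGHTS.items())


def link(left_records, right_records, threshold=THRESHOLD):
    """Nested-loop search: keep every pair scoring at or above the threshold."""
    return {
        (l["id"], r["id"])
        for l in left_records
        for r in right_records
        if match_score(l, r) >= threshold
    }


def precision_recall(predicted: set, truth: set):
    """Precision = correct links / predicted links; recall = correct links / true links."""
    true_pos = len(predicted & truth)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(truth) if truth else 0.0
    return precision, recall


# Made-up records for illustration only; not data from the study.
vital = [
    {"id": "V1", "last_name": "Johnson", "first_name": "Emily", "birth_date": "1998-03-14"},
    {"id": "V2", "last_name": "Smith", "first_name": "Aaron", "birth_date": "2001-11-02"},
    {"id": "V3", "last_name": "Nguyen", "first_name": "Linh", "birth_date": "1987-07-30"},
]
defects = [
    {"id": "D1", "last_name": "Jonson", "first_name": "Emily", "birth_date": "1998-03-14"},
    {"id": "D2", "last_name": "Nguyen", "first_name": "Lin", "birth_date": "1987-07-03"},
]
truth = {("V1", "D1"), ("V3", "D2")}

links = link(vital, defects)
print(links, precision_recall(links, truth))
```

Raising the threshold in this toy setup trades recall for precision, which is the kind of parameter effect the abstract says users can explore. A nested-loop search compares every pair, so it would not scale to the 1.25 million vital records mentioned above without a blocking or sorted-neighborhood step.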

Similar resources

FRIL: A Tool for Comparative Record Linkage

A fine-grained record integration and linkage tool (FRIL) is presented. The tool extends traditional record linkage tools with a richer set of parameters. Users may systematically and iteratively explore the optimal combination of parameter values to enhance linking performance and accuracy. Results of linking a birth defects monitoring program and birth certificate data using FRIL show 99% pre...

Implementing One to Many Data Linkage Using One Class Clustering Tree

The task of data linkage is performed among entities of the same type. The one to one data linkage links one record from one table and another one record in another table. It is extremely necessary to develop linkage techniques that link between matching entities of different types and also to improve one to one linkage to one to many data linkage as well. The proposed method emphasizes on one-...

TAILOR: A Record Linkage Tool Box

Data cleaning is a vital process that ensures the quality of data stored in real-world databases. Data cleaning problems are frequently encountered in many research areas, such as knowledge discovery in databases, data warehousing, system integration and e-services. The process of identifying the record pairs that represent the same entity (duplicate records), commonly known as record linkage, ...

The Effect of Geopolymerization on the Unconfined Compressive Strength of Stabilized Fine-grained Soils

This study focuses on evaluating the unconfined compressive strength (UCS) of improved fine-grained soils. A large database of unconfined compressive strength of clayey soil specimens stabilized with fly ash and blast furnace slag based geopolymer were collected and analyzed. Subsequently, using adaptive neuro fuzzy inference system (ANFIS), a model has been developed to assess the UCS of stabi...

Ultra-Fine Grained Dual-Phase Steels

This paper provides an overview on obtaining low-carbon ultra-fine grained dual-phase steels through rapid intercritical annealing of cold-rolled sheet as improved materials for automotive applications. A laboratory processing route was designed that involves cold-rolling of a tempered martensite structure followed by a second tempering step to produce a fine grained aggregate of ferrite and ca...

Journal:
  • Birth defects research. Part A, Clinical and molecular teratology

Volume 82, Issue 11

Pages: -

Publication date: 2008